perm filename CIRCUM[F83,JMC]4 blob
sn#752002 filedate 1984-04-27 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00015 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00003 00002 .<<circum[f83,jmc]>>
C00005 00003 .skip to column 1
C00013 00004 #. %3A typology of uses of non-monotonic reasoning%1
C00023 00005 #. %3Minimizing abnormality%1
C00025 00006 #. %3Whether birds can fly%1
C00038 00007 #. %3The unique names hypothesis%1
C00050 00008 #. %3Two examples of Raymond Reiter%1
C00054 00009 #. %3A More general treatment of an %2is-a%1 hierarchy%1
C00058 00010 #. %3The blocks world%1
C00061 00011 #. %3An example of doing the circumscription.%1
C00066 00012 #. %3General considerations and remarks%1
C00072 00013 %3Appendix A%1
C00083 00014 %3Appendix B%1
C00090 00015 #. %3References:%1
C00093 ENDMK
C⊗;
.<<circum[f83,jmc]>>
.require "memo.pub[let,jmc]" source;
.at "q1" ⊂"%41%*"⊃
.at "q2" ⊂"%42%*"⊃
.at "qi" ⊂"%4i%*"⊃
.at "qm" ⊂"%4n%*"⊃
.every heading(,Applications of Circumscription,{page})
.cb Applications of Circumscription to Formalizing Common Sense Knowledge
.cb "by John McCarthy, Stanford University"
.skip 5
Abstract: We present a new and more symmetric version of the
circumscription method of non-monotonic reasoning (McCarthy 1980) and some
applications to formalizing common sense knowledge. The applications in
this paper are based on minimizing the abnormality of the different
aspects of a situation. Included are non-monotonic treatments of ⊗is-a
hierarchies, the unique names hypothesis, and the frame problem.
.skip 5
Descriptors: non-monotonic logic, common sense, artificial intelligence.
.skip to column 1
.double space
#. %3Introduction and new definition of circumscription%1
(McCarthy 1980) introduces the circumscription method of
non-monotonic reasoning and gives motivation, some mathematical properties
and some examples of its application. While the present paper is
logically self-contained, motivation may require some acquaintance with
the earlier paper. In particular we don't repeat its arguments about the
importance of non-monotonic reasoning in AI. Also its examples are
instructive.
Here we give a more symmetric definition of
circumscription and applications to the formal expression of common
sense facts. Our goal is to express these facts in a way that would
be suitable for inclusion in a general purpose database of common
sense facts. We imagine this database to be used by AI programs
written after the initial preparation of the database. It would
be best if the writers of these programs didn't have to
be familiar with how the common sense facts about particular
phenomena are expressed.
Thus
common sense knowledge must be represented in a way that is not
specific to a particular application.
It turns out that many such common sense facts can be formalized
in a uniform way. A single predicate ⊗ab, standing for "abnormal" is
circumscribed with certain other predicates and functions considered as
variables that can be constrained to achieve the circumscription subject
to the axioms. So far this seems to cover the use of circumscription to
represent default rules.
On the other hand, it doesn't seem that circumscribing abnormality can
be used to cover many of the examples of (McCarthy 1980) in which we
want to limit a set to those objects that must be members.
#. A new version of circumscription.
.skip 1
%3Definition:%1 Let %2A(P)%1 be a formula of second order logic, where ⊗P is
a tuple of some of the free predicate symbols in ⊗A(P).
Let ⊗E(P,x) be a wff in which
⊗P and a tuple ⊗x of individual variables occur free. The circumscription
of ⊗E(P,x) relative to ⊗A(P) is the formula ⊗A'(P) defined by
!!a1: %2A(P) ∧ ∀P'.[A(P') ∧ [∀x.E(P',x) ⊃ E(P,x)] ⊃ [∀x.E(P',x) ≡ E(P,x)]].%1
[We are here writing ⊗A(P) instead of %2A(Pq1,...,Pqm)%1 for brevity
and likewise writing ⊗E(P,x) instead of
%2E(Pq1,...,Pqm,xq1,...,xqm)%1.
Likewise the quantifier ⊗∀x stands for %2∀xq1...xqm%1.
In principle ⊗A(P) may have embedded quantifiers, but we have no applications
yet that use this generality. Circumscription is a kind of minimization,
and the predicate symbols in ⊗A(P) that are not in ⊗P itself act as
parameters in this minimization.
.skip 1
There are two differences between this and (McCarthy 1980). First,
in that paper ⊗E(P,x) had the specific form ⊗P(x). Here we speak of
circumscribing a wff, while there we could speak of circumscribing a predicate.
We can still speak of circumscribing the predicate ⊗P when ⊗E(P,x) has
the special form ⊗P(x). The present form is more symmetric in that
all of the predicate symbols in ⊗P are regarded as variables, and a wff
is minimized, whereas the earlier form distinguishes one of the predicates
themselves for minimization. However, the present form is reducible to
the earlier form.
Second, in ({eq a1}) we use an explicit quantifier for the predicate
variable ⊗P' whereas in (McCarthy 1980), the formula was a schema.
One advantage of the present formalism is that now ⊗A'(P) is the
same kind of formula as ⊗A(P) and can be used as part of the axiom for
circumscribing some other wff.
Remark: The predicate symbols %2Pq1,...,Pqm%1 will not usually be all the
predicate symbols occurring in ⊗A(P).
The situation is analogous to a minimization problem in calculus
or calculus of variations. Some quantity is being minimized. Other
quantities are variables whose values are to be chosen so as to achieve
the minimum. Still others are parameters. They are not varied
in the minimization process, and therefore the minimum obtained and the
values of the minimizing variables depend
on the values of the parameters. We will point out the variables and
the parameters in the subsequent examples.
In some of the literature, it has been supposed that non-monotonic
reasoning involves giving all predicates their minimum extension. This
mistake has led to theorems about what reasoning cannot be
done that are irrelevant to AI and database theory,
because their premisses are too narrow.
.skip 2
#. %3A typology of uses of non-monotonic reasoning%1
Each of the several papers that introduces a mode of non-monotonic
reasoning seems to have a particular application in mind. Perhaps we are
looking at different parts of an elephant. Therefore, before proceeding
to applications of circumscription I want to suggest a typology of uses of
non-monotonic reasoning. The orientation is towards circumscription,
but I suppose the considerations apply to other formalisms as well.
Non-monotonic reasoning has several uses.
1. As a communication convention. Suppose A tells B about
a situation involving a bird. If the bird cannot fly, and this
is relevant, then A must
say so. Whereas if the bird can fly, there is
no requirement to mention the fact. For example, if I hire you
to build me a bird cage and you don't put a top on it, I can
get out of paying for it even if you tell the judge that I never
said my bird could fly. However, if I complain that you wasted
money putting a top on a cage I intended for a penguin, the judge
will agree with you that if the bird couldn't fly I should have
said so.
2. As a database or information storage convention. It may be a
convention of a particular database that certain predicates have
their minimal extension. This generalizes the closed world
assumption. When a database makes the closed world assumption
for all predicates it is reasonable to imbed this fact in the
programs that use the database.
However, when only some predicates are to be minimized,
we need to say which ones by appropriate
sentences of the database, perhaps as a preamble
to the collection of ground sentences that usually constitute the
main content.
Neither 1 nor 2 requires that most birds can fly.
Should it happen that most birds that are subject to the communication
or about which information is requested from the data base cannot fly, the
convention may lead to inefficiency but not incorrectness.
3. As a rule of conjecture. This use was emphasized
in (McCarthy 1980). The circumscriptions may be regarded as expressions of some
probabilistic notions such as "most birds can fly" or they may be
expressions of standard cases. Thus it is simple to conjecture that
there are no relevant present
material objects other
than those whose presence can be inferred. It is also
a simple conjecture that a tool asserted to be present is usable
for its normal function. Such conjectures sometimes conflict, but there
is nothing wrong with having incompatible conjectures on hand. Besides
the possibility of deciding that one is correct and the other wrong, it
is possible to use one for generating possible exceptions to the other.
4. As a representation of a policy. The example is Doyle's "The meeting
will be on Wednesday unless another decision is explicitly made".
Again probabilities are not involved.
5. As a very streamlined expression of probabilistic information when
numerical probabilities, especially conditional probabilities, are
unobtainable. Since circumscription doesn't provide numerical probabilities,
its probabilistic interpretation involves
probabilities that are either infinitesimal, within an
infinitesimal of one, or intermediate - without any discrimination
among the intermediate values. The circumscriptions give conditional
probabilities. Thus we may treat the probability that a bird
can't fly as an infinitesimal. However, if the rare
event occurs that the bird is a penguin, then the conditional probability that
it can fly is infinitesimal, but we may hear of some rare condition
that would allow it to fly after all.
Why don't we use finite
probabilities combined by the usual laws? That would be fine
if we had the numbers, but circumscription is usable when we can't
get the numbers or find their use inconvenient. Note that the
general probability that a bird can fly may be irrelevant, because
we are interested in the facts that influence our opinion about
whether a particular bird can fly in a particular situation.
Moreover, the use of probabilities is normally considered
to require the definition of a sample space, i.e. the space of all
possibilities. Circumscription allows one to conjecture that the
cases we know about are all that there are. However, when additional
cases are found, the axioms don't have to be changed. Thus there
is no fixed space of all possibilities.
Notice also that circumscription does not provide for
weighing evidence; it is appropriate
when the information permits snap decisions. However, many
cases nominally treated in terms of weighing information are in fact
cases in which the weights are such that circumscription and other
defaults work better.
6. Auto-epistemic reasoning. "If I had an elder brother, I'd know it".
This has been studied by R. Moore. Perhaps it can be handled by
circumscription.
7. Both common sense physics and common sense psychology use non-monotonic
rules. An object will continue in a straight line if nothing interferes
with it. A person will eat when hungry unless something prevents it.
Such rules are open ended about what might prevent the expected behavior,
and this is required, because we are always encountering unexpected
phenomena that modify the operation of our rules. Science, as distinct
from common sense, tries to work with exceptionless rules. However,
this means that common sense reasoning has to decide when a scientific
model is applicable, i.e. that there are no important phenomena not
taken into account by the theories being used and the model of the
particular phenomena.
Seven different uses for non-monotonic reasoning seem
too many, so perhaps we can condense later.
.skip 2
#. %3Minimizing abnormality%1
Many people have proposed representing facts about what is
"normally" the case. One problem is that every object is abnormal
in some way, and we want to allow some aspects of the object to be
abnormal and still assume the normality of the rest.
We do this with a predicate ⊗ab standing for "abnormal". We circumscribe
%2ab_z%1. The argument of ⊗ab will be some aspect of the entities
involved. Some aspects can be abnormal without affecting others. The
aspects themselves are abstract entities, and their unintuitiveness is
somewhat a blemish on the theory.
There are many applications.
.skip 2
#. %3Whether birds can fly%1
Marvin Minsky (1982) offered expressing the facts and
non-monotonic reasoning concerning the ability of birds to fly
as a challenge to advocates of formal systems based on mathematical
logic.
There are many ways of non-monotonically axiomatizing the
facts about which birds can fly. The following set of axioms using ⊗ab seems
to me quite straightforward.
!!a4a: %2∀x.¬ab aspect1 x ⊃ ¬flies x%1.
Unless an object is abnormal in ⊗aspect1, it can't fly. (We're using a
convention that parentheses may be omitted for functions and predicates
of one argument, so that ({eq a4a}) is the same as %2∀x.(¬ab(aspect1(x)
⊃ ¬flies(x))%1.)
Note that it wouldn't work to write %2ab x%1 instead of %2ab aspect1 x%1,
because we don't want a bird that is abnormal with respect to its ability
to fly to be automatically abnormal in other respects. Using aspects limits
the effects of proofs of abnormality.
!!a5: %2∀x.bird x ⊃ ab aspect1 x%1.
A bird is abnormal in ⊗aspect1, so ({eq a4a}) can't be used to show it
can't fly. If ({eq a5}) were omitted, when we did the circumscription we
would only be able to infer a disjunction. Either a bird is abnormal in
%2aspect1%1 or it can fly unless it is abnormal in ⊗aspect2. ({eq a5})
establishes expresses our preference for inferring that a bird is abnormal
in ⊗aspect1 rather than ⊗aspect2. We call ({eq a5}) a %2cancellation of
inheritance%1 axiom. We will see more of them.
!!a6: %2∀x.bird x ∧ ¬ab aspect2 x ⊃ flies x%1.
Unless a bird is abnormal in ⊗aspect2, it can fly.
!!a7: %2∀x.ostrich x ⊃ ab aspect2 x%1.
Ostriches are abnormal in ⊗aspect2. This doesn't say that an ostrich cannot
fly - merely that ({eq a6}) can't be used to infer that it does. ({eq a7})
is another cancellation of inheritance axiom.
!!a8: %2∀x.penguin x ⊃ ab aspect2 x%1.
Penguins are also abnormal in ⊗aspect2.
!!a9: %2∀x.ostrich x ∧ ¬ab aspect3 x ⊃ ¬flies x%1.
!!a10: %2∀x.penguin x ∧¬ab aspect4 x ⊃ ¬flies x%1.
Normally ostriches and penguins can't fly. However, there is
an out. ({eq a9}) and ({eq a10}) provide that under unspecified conditions,
an ostrich or penguin might fly after all. If we give no
such conditions, we will conclude that an ostrich or penguin
can't fly. Additional objects that can fly may be specified.
Each needs two axioms. The first says that it is abnormal in
⊗aspect1 and prevents ({eq a4a}) from being used to say that it can't
fly. The second provides that it can fly unless it is abnormal
in yet another way. Additional non-flying birds can also be
provided for at a cost of two axioms per kind.
We haven't yet said that ostriches and penguins are birds,
so let's do that and throw in that canaries are birds also.
!!a11: %2∀x.ostrich x ⊃ bird x%1
!!a12: %2∀x.penguin x ⊃ bird x%1
!!a13: %2∀x.canary x ∧ ¬ab aspect5 x ⊃ bird x%1.
We threw in ⊗aspect5 just in case one wanted to use the term canary
in the sense of a 1930s gangster movie.
Asserting that ostriches, penguins and canaries are birds will help
inherit other properties from the class of birds. For example, we have
!!a14: %2∀x. bird x ∧ ¬ab aspect6 x ⊃ feathered x%1.
So far there is nothing to prevent ostriches, penguins and canaries
from overlapping. We could write disjointness axioms like
!!a15: %2∀x. ¬ostrich x ∨ ¬penguin x%1,
but we would require %2n%52%1 of them if we have ⊗n species.
This problem is like the unique names problem.
If these are the only facts to be taken into account, we
must somehow specify that what can fly is to be determined
by circumscribing the wff ⊗ab_z using ⊗ab and ⊗flies as variables.
Why exactly these? If ⊗ab were not taken as variable, ⊗ab_z couldn't
vary either, and the minimization problem would go away. Since
the purpose of the axiom set is to describe what flies, the predicate
⊗flies must be varied also. Suppose we contemplate taking ⊗bird as
variable also. In the first place, this violates an intuition that
deciding what flies follows deciding what is a bird in the common
sense situations we want to cover. Secondly, if we use exactly the
above axioms and admit bird as a variable, we will further conclude
that the only birds are penguins, canaries and ostriches. Namely,
for these entities something has to be abnormal, and therefore minimizing
⊗ab_z will involve making as few entities as possible penguins, canaries
and ostriches. If we also admit ⊗penguin, ⊗ostrich, and ⊗canary as
variable, we will succeed in making ⊗ab_z always false, and there will be
no birds at all.
However, if the same circumscriptions are done with additional
axioms like ⊗canary_Tweety and ⊗ostrich_Joe, we will get the
expected result that Tweety can fly and Joe cannot even if all the
above are variable.
While this works it may be more straightforward, and
therefore less likely to lead to subsequent trouble, to circumscribe
birds, ostriches and penguins with axioms like
!!a2: %2∀x.¬ab aspect8 x ⊃ ¬bird x%1, etc.
We have not yet specified how a program will know what to
circumscribe. One extreme is to build it into the program, but
this is contrary to the declarative spirit. However, a statement
of what to circumscribe isn't just a sentence of the language
because of its non-monotonic character. My present, admittedly
somewhat unsatisfactory, idea is to include some sort of metamathematical
statement like
!!a16: %2circumsribe(ab z; ab, flies, bird, ostrich, penguin)%1
in a "policy" database available to the program. ({eq a16}) is
intended to mean that ⊗ab_z is to be circumscribed with ⊗ab, ⊗flies,
⊗bird, ⊗ostrich and ⊗penguin taken as variable. Explicitly listing
the variables makes adding new kinds awkward, since they will have
to be mentioned in the ⊗circumscribe statement.
.if false then begin
!!a2a1: %2∀ab flies.flying-facts(ab,flies) ≡
.begin nofill
∂8∧ (∀x.¬ab aspect1 x ⊃ ¬flies x)
∂8∧ (∀x.bird x ⊃ ab aspect1 x)
∂8∧ (∀x.bird x ∧ ¬ab aspect2 x ⊃ flies x)
∂8∧ (∀x.ostrich x ⊃ ab aspect2 x)
∂8∧ (∀x.penguin x ⊃ ab aspect2 x)
∂8∧ (∀x.ostrich x ∧ ¬ab aspect3 x ⊃ ¬flies x)
∂8∧ (∀x.penguin x ∧¬ab aspect4 x ⊃ ¬flies x)
∂8∧ (∀x.ostrich x ⊃ bird x)
∂8∧ (∀x.penguin x ⊃ bird x)
∂8∧ (∀x.canary x ∧ ¬ab aspect5 x ⊃ bird x%1).
.end
.skip 1
({eq a2}) is the circumscription, and ({eq a2a1}) defines the
predicate %2flying-facts%1 as the conjunction of axioms
({eq a4a}...{eq a13}). This ties down precisely
what is to be varied. However, it complicates the nonmonotonicity,
because new facts to be included in the circumscription go within
the quantifier %2∀ab_flies%1 of ({eq a2a1}). This is not in accordance
with what was previously advertised about how nonmonotonic axioms systems
work. It doesn't offer any additional technical difficulties, however.
.end
.if false then begin
Therefore, we write axioms
!!a16: %2∀x. ostrich x ≡ belongs(x,'ostrich)%1
and similarly for penguins, canaries and birds.
We can now write
!!a17: %2∀n1 n2. n1 ≠ n2 ∧ ¬ab aspect8(n1,n2) ⊃ ∀x.¬(belongs(x,n1) ∧ belongs(x,n2)%1.
This doesn't do everything we want, because we have allowed a canary ⊗x
two choices. It can be a bird avoiding %2ab_aspect5_x%1 or it can be
a non-bird avoiding %2aspect8('canary,'bird)%1. We want the former to
have priority and can do it simply by writing
!!a18: %2ab aspect8('canary,bird)%1.
In what follows we will need uniqueness axioms for the aspects,
and they are a nuisance to write. They could be written
!!a18a: %2∀n1 n2 x1 x2.aspect(n1,x1) = aspect(n2,x2) ≡ n1 = n2 ∧ x1 = x2%1,
if we make ⊗n a parameter, i.e. write ⊗aspect(2,x) instead of ⊗aspect2_x, but
even this doesn't to everything we need, because it doesn't cover aspects
that take more than one argument. Aspects with many arguments can be reduced to
one argument aspects by assuming them to operate on tuples. For this paper,
we won't make this change of notation and merely postulate that all aspects
are distinct objects, assuming as many axioms as necessary.
.end
.skip 1
#. %3The unique names hypothesis%1
Raymond Reiter (1979) introduced the phrase "unique names hypothesis"
for the assumption each object has a unique name, i.e.
that distinct names denote distinct objects. We want
to treat this non-monotonically. Namely, we want a wff that picks out
those models of our initial assumptions maximize the inequality of
the denotations of constant symbols.
While we're at it, we might as well try for something
stronger. We want to maximize the extent to which distinct terms designate
distinct objects. When there is a unique model of the axioms that maximizes
distinctness, we can put it more simply; two terms denote distinct objects
unless the axioms force them to denote the same.
If we are even more
fortunate, as we are in the examples to be given, we can say that two
terms denote distinct objects unless their equality is provable.
We don't know a completely satisfactory way of doing this.
Suppose that we have a language ⊗L and a theory ⊗T
consisting of the consequences of a formula ⊗A.
It would be most pleasant if we could just circumscribe equality,
but as Reiter and Etherington (to be published) point out, this doesn't
work, and nothing similar works. We could hope to circumscribe
some other formula of the ⊗L, but this doesn't seem
to work either. Failing that, we could hope for some other second
order formula taken from ⊗L that would express the unique names
hypothesis, but we don't presently see how to do it.
Our solution involves extending the language by introducing
the names themselves as the only objects. All assertions about objects
are expressed as assertions about the names.
We suppose our theory is such
that the names themselves
are all provably distinct. There are several ways of doing
this. Let the names be ⊗n1, ⊗n2, etc. The simplest solution is
to have an axiom %2ni_≠nj%1 for each pair of distinct names. This
requires a number of axioms proportional to the square of the number
of names, which is sometimes objectionable. The next solution involves
introducing an arbitrary ordering on the names. We have special
axioms %2n1_<_n2, n2_<_n3, n3_<_n4%1, etc. and the general axioms
%2∀x_y.x_<_y_⊃_x_≠_y%1 and %2∀x_y_z._x_<_y_∧_y_<_z_⊃_x_<_z%1. This
makes the number of axioms proportional to the number of names.
A third possibility involves mapping the names onto integers with
axioms like %2index_n1_=_1, index_n2_=_2%1, etc. and using a theory
of the integers that provides for their distinctness. The fourth
possibility involves using string constants for the names and
"attaching" to equality in the language a subroutine that computes
whether two strings are equal. If our names were quoted symbols as
in LISP, this amounts to having %2'a_≠_'b%1 and all its countable
infinity of analogs as axioms. Each of these devices is useful
in appropriate circumstances.
From the point of view of mathematical logic, there is no harm in
having an infinity of such axioms. From the computational point of view
of a theorem proving or problem solving program, we merely suppose that we
rely on the computer to generate the assertion that two names are distinct
whenever this is required, since a subroutine can easily tell whether two
strings are the same.
Besides axiomatizing the distinctness of the constants, we also
want to axiomatize the distinctness of terms. This may be accomplished
by providing for each function two axioms. Letting ⊗foo be a function
of two arguments we postulate
!!n1: %2∀x1 x2 y1 y2.foo(x1,y1) = foo(x2,y2) ⊃ x1 = x2 ∧ y1 =y2%1
and
!!n2: %2∀x y.fname foo(x,y) = 'foo%1.
The first axiom ensures that unless the arguments of ⊗foo are identical,
its values are distinct. The second ensures that the values of ⊗foo
are distinct from the values of any other function or any constant,
assuming that we refrain from naming any constant ⊗'foo.
These axioms amount to making our domain isomorphic to
the Herbrand universe of the language.
Now that the names are guaranteed distinct, what about the
objects they denote? We introduce a
predicate ⊗e(x,y) and axiomatize it to be an equivalence relation.
Its intended interpretation is that the names ⊗x and ⊗y denote the same
object. We then formulate all our usual axioms in terms of names
rather than in terms of objects. Thus ⊗on(n1,n2) means that the object
named by ⊗n1 is on the object named by ⊗n2, and ⊗bird_x means that the
name ⊗x denotes a bird. We add axioms of substitutivity for ⊗e with
regard to those predicates and functions
that are translates of predicates referring
to objects rather than predicates on the names themselves. Thus we
may have axioms
!!a4: %2∀n1 n2 n1' n2'.e(n1,n1') ∧ e(n2,n2') ⊃ (on(n1,n2) ≡ on(n1',n2')%1,
and
!!n3: %2∀x1 x2 y1 y2.e(x1,x2) ∧ e(y1,y2) ⊃ e(foo(x1,y1),foo(x2,y2))%1.
If for some class ⊗C of names, we wish to assert the unique
names hypothesis, we simply use an axiom like
!!a4c: %2∀n1 n2.n1 ε C ∧ n2 ε C ⊃ (e(n1,n2) ≡ n1 = n2)%1.
However, we often want only to assume that distinct names denote
distinct objects when this doesn't contradict our other assumptions.
In general, our axioms won't permit making all names distinct simultaneously,
and there will be several models with maximally distinct objects. The
simplest example is obtained by circumscribing ⊗e(x,y) while adhering
to the axiom
!!a4b: %2e(n1,n2) ∨ e(n1,n3)%1
where ⊗n1, ⊗n2, and ⊗n3 are distinct names. There will then be two
models, one satisfying ⊗e(n1,n2)_∧_¬e(n1,n3) and the other satisfying
⊗¬e(n1,n2)_∧_e(n1,n3).
Thus circumscribing ⊗e(x,y) maximizes uniqueness of names.
If we only want unique names for some class ⊗C of names, then we
circumscribe the formula %2x_ε_C_∧_y_ε_C_⊃_e(x,y)%1.
An
example of such a circumscription is given in Appendix B. However, there
seems to be a price. Part of the price is admitting names as objects.
Another part is admitting the predicate ⊗e(x,y) which is substitutive
for predicates and functions
of names that really are about the objects denoted by
the names. ⊗e(x,y) is not to be taken as substitutive for predicates
on names that aren't about the objects. Of these our only present example
is equality. Thus we don't have
* %2∀n1 n2 n1' n2'.e(n1,n1') ∧ e(n2,n2') ⊃ (n1 = n2) ≡ n1' = n2')%1.
The awkward part of the price is that we must refrain from any
functions whose values are the objects themselves rather than names.
They would spoil the circumscription by not allowing us to infer the
distinctness of the objects denoted by distinct names. Actually, we
can allow them provided we don't include the axioms involving them
in the circumscription. Unfortunately, this spoils the other property
of circumscription that lets us take any facts into account.
The examples of the use of circumscription in AI in the rest
of the paper don't interpret the variables as merely ranging over
names. Therefore, they are incompatible with getting unique names
by circumscription as described in this section. Presumably it wouldn't
be very difficult to revise those axioms for compatibility with the
present approach to unique names.
.skip 1
#. %3Two examples of Raymond Reiter%1
Reiter asks about representing, "Quakers are normally pacifists
and Republicans are normally non-pacifists. How about Nixon, who is
both a Quaker and a Republican?" Systems of
non-monotonic reasoning that use non-provability as a basis for
inferring negation will infer that Nixon is neither a pacifist
nor a non-pacifist. Combining these conclusions with the original
premiss leads to a contradiction.
We use
!!c1: %2∀x.quaker x ∧ ¬ab aspect1 x ⊃ pacifist x%1,
!!c2: %2∀x.republican x ∧ ¬ab aspect2 x ⊃ ¬ pacifist x%1
and
!!c3: %2quaker Nixon ∧ republican Nixon%1.
When we circumscribe ⊗ab_z using these three sentences as
⊗A(ab), we will only be able to conclude that Nixon is either abnormal
in ⊗aspect1 or in ⊗aspect2, and we will not be able to say whether he
is a pacifist. Of course, this is the same conclusion as would be reached
without circumscription. The point is merely that we avoid contradiction.
Reiter's second example is that a person normally lives in the same
city as his wife and in the same city as his employer. But A's wife
lives in Vancouver and A's employer is in Toronto.
We write
!!c4: %2∀x.¬ab aspect1 x ⊃ city x = city wife x%1
and
!!c5: %2∀x.¬ab aspect2 x ⊃ city x = city employer x%1.
If we have
!!c6: %2city wife A = Vancouver ∧ city employer A = Toronto ∧ Toronto ≠ Vancouver%1,
we will again only be able to conclude that A lives either in Toronto
or Vancouver. In this circumscription, the function ⊗city must be taken
as variable. This might be considered not entirely satisfactory. If one knows
that a person either doesn't live in the same city as his wife or doesn't live
in the same city as his employer, then there is an increased probability that
he doesn't live in the same city as either. A system that did reasoning of
this kind would seem to require a larger body of facts and perhaps more
explicitly metamathematical reasoning. Not knowing how to do that, we might
want to use %2aspect1_x%1 in both ({eq c4}) and ({eq c5}). Then once we know
we would conclude nothing about his city once we knew that he wasn't in the
same city as either.
.skip 2
#. %3A More general treatment of an %2is-a%1 hierarchy%1
The bird example works fine when a fixed %2is-a%1 hierarchy
is in question. However, our writing the inheritance cancellation axioms
depended on knowing exactly from what higher level the properties were
inherited. This doesn't correspond to my intuition of how we humans
represent inheritance. It would seem rather that when we say that
birds can fly, we don't necessarily have in mind that an
inheritance of inability to fly from things in general is being cancelled.
We can formulate inheritance of properties in a more general way provided
we reify the properties. Presumably there are many ways of doing this,
but here's one that seems to work.
The first order variables of our theory range over classes of
objects (denoted by ⊗c with numerical suffixes), properties (denoted
by ⊗p) and objects (denoted by ⊗x). We don't identify our classes
with sets (or with the classes of GB set theory). In particular, we don't
assume extensionality. We have several predicates:
⊗ordinarily(c,p) means that objects of class ⊗c ordinarily have
property ⊗p. ⊗c1_≤_c2 means that class ⊗c1 ordinarily inherits from
class ⊗c2. We assume that this relation is transitive. ⊗in(x,c) means
that the object ⊗x is in class ⊗c. ⊗ap(p,x) means that property ⊗p
applies to object ⊗x. Our axioms are
!!d0: %2∀c1 c2 c3. c1 ≤ c2 ∧ c2 ≤ c3 ⊃ c1 ≤ c3%1,
!!d1: %2∀c1 c2 p.ordinarily(c2,p) ∧ c1 ≤ c2 ∧ ¬ab aspect1(c1,c2,p)
⊃ ordinarily(c1,p)%1,
!!d2: %2∀c1 c2 c3 p.c1 ≤ c2 ∧ c2 ≤ c3 ∧ ordinarily(c2, not p)
⊃ ab aspect1(c1,c3,p)%1,
!!d3: %2∀x c p.in(x,c) ∧ ordinarily(c,p) ∧ ¬ab aspect3(x,c,p) ⊃ ap(p,x)%1,
!!d4: %2∀x c1 c2 p.in(x,c1) ∧ c1 ≤ c2 ∧ ordinarily(c1,not p)
⊃ ab aspect3(x,c2,p)%1.
({eq d0}) is the afore-mentioned transitivity of ≤. ({eq d1}) says
that properties that ordinarily hold for a class are inherited unless
something is abnormal. ({eq d2}) cancels the inheritance if there
is an intermediate class for which the property ordinarily doesn't
hold. ({eq d3}) says that properties which ordinarily hold actually
hold for elements of the class unless something is abnormal. ({eq d4})
cancels the effect of ({eq d3}) when there is an intermediate class
for which the negation of the property ordinarily holds. Notice that
this reification of properties seems to require imitation boolean
operators. Such operators are discussed in (McCarthy 1979).
.skip 2
#. %3The blocks world%1
The following set of "situation calculus" axioms solves the
frame problem for a blocks world in which blocks can be moved and painted.
Here ⊗result(e,s) denotes the situation that results when event ⊗e occurs
in situation ⊗s. The formalism is approximately that of (McCarthy and Hayes
1969).
!!b1: %2∀x e s.¬ab aspect1(x,e,s) ⊃ location(x,result(e,s)) = location(x,s)%1.
!!b2: %2∀x e s.¬ab aspect2(x,e,s) ⊃ color(x,result(e,s)) = color(x,s)%1.
Objects change their locations and colors only for a reason.
!!b3: %2∀x l s.ab aspect1(x,move(x,l),s)%1
and
%2∀x l s.¬ab aspect3(x,l,s) ⊃ location(x,result(move(x,l),s)) = l%1.
!!b4: %2∀x c s.ab aspect2(x,paint(x,c),s)%1
and
%2∀x c s.¬ab aspect4(x,c,s) ⊃ color(x,result(paint(x,c),s)) = c%1.
Objects change their locations when moved and their colors when painted.
!!b5: %2∀x l s.¬clear(top x,s) ∨ ¬clear(l,s) ∨ tooheavy x ∨ l = top x
⊃ ab aspect3(x,move(x,l),s)%1.
This prevents the
rule ({eq b3}) from being used to infer that an object will move
if it isn't clear or to a destination that isn't clear
or if the object is too heavy. An object also cannot be moved to
its own top.
!!b6: %2∀l s.clear(l,s) ≡ ¬∃x.(¬trivial x ∧ location(x,s) = l)%1.
A location is clear if all the objects there are trivial, e.g. a speck of dust.
!!b7: %2∀x.¬ab aspect7 x ⊃ ¬trivial x%1.
Trivial objects are abnormal in %2aspect7%1.
.skip 2
#. %3An example of doing the circumscription.%1
In order to keep the example short we will take into account
only the following facts from the earlier section on flying.
{eq a4a}) %2∀x.¬ab aspect1 x ⊃ ¬flies x%1.
{eq a5}) %2∀x.bird x ⊃ ab aspect1 x%1.
{eq a6}) %2∀x.bird x ∧ ¬ab aspect2 x ⊃ flies x%1.
{eq a7}) %2∀x.ostrich x ⊃ ab aspect2 x%1.
{eq a9}) %2∀x.ostrich x ∧ ¬ab aspect3 x ⊃ ¬flies x%1.
Their conjunction is taken as %2A(ab,flies)%1. This means
that what entities satisfy ⊗ab and what entities satisfy ⊗flies are
to be verified so as to minimize ⊗ab_z. Which objects are birds
and ostriches are parameters rather than variables, i.e.
what objects are birds is considered given.
We also need an axiom that asserts that the aspects are different.
Here is a straightforward version that would be rather long were there
more than three aspects.
.begin nofill
%2(∀x y.¬(aspect1 x = aspect2 y))
∧ (∀x y.¬(aspect1 x = aspect3 y))
∧ (∀x y.¬(aspect2 x = aspect3 y))
∧ (∀x y.aspect1 x = aspect1 y ≡ x =y)
∧ (∀x y.aspect2 x = aspect2 y ≡ x =y)
∧ (∀x y.aspect3 x = aspect3 y ≡ x =y).%1
.end
The circumscription formula ⊗A'(ab,flies) is then
!!a19: %2A(ab,flies)
∧ ∀ab' flies'.[A(ab',flies') ∧ [∀x. ab' x ⊃ ab x] ⊃ [∀x. ab x ≡ ab' x]]%1,
which spelled out becomes
!!a20: %2[∀x.¬ab aspect1 x ⊃ ¬flies x] ∧ [∀x.bird x ⊃ ab aspect1 x]
∧ [∀x.bird x ∧ ¬ab aspect2 x ⊃ flies x] ∧ [∀x.ostrich x ⊃ ab aspect2 x]
∧ [∀x.ostrich x ∧ ¬ab aspect3 x ⊃ ¬flies x]
∧ ∀ab' flies'.[[∀x.¬ab' aspect1 x ⊃ ¬flies' x] ∧ [∀x.bird x ⊃ ab' aspect1 x]
∧ [∀x.bird x ∧ ¬ab' aspect2 x ⊃ flies' x] ∧ [∀x.ostrich x ⊃ ab' aspect2 x]
∧ [∀x.ostrich x ∧ ¬ab' aspect3 x ⊃ ¬flies' x]
∧ [∀z.ab' z ⊃ ab z]
⊃ [∀z.ab z ≡ ab' z]]%1.
⊗A(ab,flies) is guaranteed to be true, because it is part of what
is assumed in our common sense data base. Therefore ({eq a20}) reduces
to
!!a20a: %2∀ab' flies'.[[∀x.¬ab' aspect1 x ⊃ ¬flies' x] ∧ [∀x.bird x ⊃ ab' aspect1 x]
∧ [∀x.bird x ∧ ¬ab' aspect2 x ⊃ flies' x] ∧ [∀x.ostrich x ⊃ ab' aspect2 x]
∧ [∀x.ostrich x ∧ ¬ab' aspect3 x ⊃ ¬flies' x]
∧ [∀z.ab' z ⊃ ab z]
⊃ [∀z.ab z ≡ ab' z]]%1.
Our objective is now to make suitable substitutions for ⊗ab' and
⊗flies' so that all the terms preceding the ⊃ ({eq a20a}) will be true,
and right side will determine ⊗ab. The axiom ⊗A(ab,flies) will then
determine ⊗flies, i.e. we will know what the fliers are.
⊗flies' is easy, because we need only apply wishful
thinking; we want the fliers to be just those birds that aren't ostriches.
Therefore, we put
!!a21: %2flies' x ≡ bird x ∧ ¬ostrich x%1.
⊗ab' isn't really much more difficult, but there is a notational
problem. We define
!!a22: %2ab' z ≡ [∃x.bird x ∧ z = aspect1 x] ∨ [∃x.ostrich x ∧ z = aspect2 x]%1,
which covers the cases we want to be abnormal.
Appendix A contains a complete proof as accepted by Jussi
Ketonen's (1984) interactive theorem prover EKL. EKL uses the theory of
types and therefore has no problem with the second order logic required by
circumscription.
.skip 2
#. %3General considerations and remarks%1
1. Suppose we have a data base of facts axiomatized by a formalism
involving the predicate ⊗ab. In connection with a particular problem, a
program takes a subcollection of these facts together with the specific
facts of the problem and then circumscribes ⊗ab_z. We get a second order
formula, and in general, as the natural number example of (McCarthy 1980)
shows, this formula is not equivalent to any first order formula.
However, many common sense domains are axiomatizable in such a way that
the circumscription is equivalent to a first order formula. For example,
Vladimir Lifschitz (unpublished 1983)
has shown that this is true if the axioms are a
universal formula and there are no functions. This can presumably be
extended to the case when the ranges and domains of the functions are
disjoint, so that there is no way of generating an infinity of elements.
We can then regard the process of deciding what facts to take
into account and then circumscribing as a process of compiling from
a slightly higher level non-monotonic language into mathematical
logic, especially first order logic. We can also regard natural
language as higher level than logic, although, as I shall discuss
elsewhere, natural language doesn't have an independent reasoning
process, because most natural language inferences involve suppressed
premisses which are not represented in natural language in the minds
of the people doing the reasoning.
Reiter has pointed out, both informally and implicitly in
(Reiter 1982) that circumscription often translates directly into
Prolog program once it has been decided what facts to take into
account.
2. Circumscription has interesting relations to Raymond Reiter's (1980)
logic of defaults. In simple cases they give the same results.
However, computation using default logic involves establishing the
existence of models.
A computer program using default logic would have to establish
the existence of models, perhaps by constructing them, in order to
determine that the sentences mentioned in default rules were consistent.
Such computations are not just selectively applying the rules of
inference of logic but are metamathematical. At present this is
treated entirely informally, and I am not aware of any computer
program that finds models of sets of sentences or even
interacts with a user to find and verify such models.
Circumscription works entirely within the logic as Appendices A
and B illustrate. It can do this, because it uses second order logic to
import some of the model theory of first order formulas into the theory
itself. Finding the right substitution for the predicate variables is, in
the cases we have examined, the same task as finding models of a first
order theory. In our opinion, putting everything into the logic itself is
an advantage.
Notice, however, that finding an interpretation of a language has
two parts - finding a domain and interpreting the predicate and
function letters by predicates and functions on the domain. It
seems that the second is easier to import into second order logic
than the first. This is why our treatment of unique names is awkward.
3. We are only part way to our goal of providing a formalism
in which a database of common sense knowledge can be expressed.
Besides sets of axioms involving ⊗ab, we need ways of specifying
what facts shall be taken into account and what functions and
predicates are to be taken as variable.
.skip 2
%3Appendix A%1
Circumscription in a Proof Checker
EKL is an interactive proof checker for the theory of types.
The present argument makes use of its ability to do second order logic.
Each step begins with a command given by the user. This is usually
followed by the sentence resulting from the step, but this is omitted
for definitions when the information is contained in the command.
We follow each step by a brief explanation.
.skip 1
.begin nofill
%7
1. (DEFINE A
|∀AB FLIES.A(AB,FLIES)≡
(∀X.¬AB(ASPECT1(X))⊃¬FLIES(X))∧(∀X.BIRD(X)⊃AB(ASPECT1(X)))∧
(∀X.BIRD(X)∧¬AB(ASPECT2(X))⊃FLIES(X))∧
(∀X.OSTRICH(X)⊃AB(ASPECT2(X)))∧
(∀X.OSTRICH(X)∧¬AB(ASPECT3(X))⊃¬FLIES(X))| NIL)
This defines the second order predicate %2A(ab,flies)%*, where ⊗ab and
⊗flies are predicate variables. Included here are the specific facts
about flying being taken into account.
;labels: SIMPINFO
2. (AXIOM
|(∀X Y.¬ASPECT1(X)=ASPECT2(Y))∧(∀X Y.¬ASPECT1(X)=ASPECT3(Y))∧
(∀X Y.¬ASPECT2(X)=ASPECT3(Y))∧(∀X Y.ASPECT1(X)=ASPECT1(Y)≡X=Y)∧
(∀X Y.ASPECT2(X)=ASPECT2(Y)≡X=Y)∧(∀X Y.ASPECT3(X)=ASPECT3(Y)≡X=Y)|)
These facts about the distinctness of aspects are used in step 26 only.
Since axiom 2 is labelled SIMPINFO, the EKL simplifier uses it
when appropriate when it is asked to simplify a formula.
3. (DEFINE A1
|∀AB FLIES.A1(AB,FLIES)≡
A(AB,FLIES)∧
(∀AB1 FLIES1.A(AB1,FLIES1)∧(∀Z.AB1(Z)⊃AB(Z))⊃(∀Z.AB(Z)≡AB1(Z)))|
NIL)
This is the circumscription formula itself.
4. (ASSUME |A1(AB,FLIES)|)
deps: (4)
Since EKL cannot be asked (yet) to do a circumscription, we assume the
result. Most subsequent statements list line 4 as a dependency.
This is appropriate since circumscription is a rule of conjecture rather
than a rule of inference.
5. (DEFINE FLIES2 |∀X.FLIES2(X)≡BIRD(X)∧¬OSTRICH(X)| NIL)
This definition and the next say what we are going to substitute for
the bound predicate variables.
6. (DEFINE AB2
|∀Z.AB2(Z)≡(∃X.BIRD(X)∧Z=ASPECT1(X))∨(∃X.OSTRICH(X)∧Z=ASPECT2(X))|
NIL)
The fact that this definition is necessarily somewhat awkward makes
for some difficulty throughout the proof.
7. (RW 4 (OPEN A1))
A(AB,FLIES)∧(∀AB1 FLIES1.A(AB1,FLIES1)∧(∀Z.AB1(Z)⊃AB(Z))⊃(∀Z.AB(Z)≡AB1(Z)))
deps: (4)
This step merely expands out the circumscription formula.
8. (TRW |A(AB,FLIES)| (USE 7))
A(AB,FLIES)
deps: (4)
We separate the two conjuncts of 7 in this and the next step.
9. (TRW |∀AB1 FLIES1.A(AB1,FLIES1)∧(∀Z.AB1(Z)⊃AB(Z))⊃(∀Z.AB(Z)≡AB1(Z))|
(USE 7))
∀AB1 FLIES1.A(AB1,FLIES1)∧(∀Z.AB1(Z)⊃AB(Z))⊃(∀Z.AB(Z)≡AB1(Z))
deps: (4)
10. (RW 8 (OPEN A))
(∀X.¬AB(ASPECT1(X))⊃¬FLIES(X))∧(∀X.BIRD(X)⊃AB(ASPECT1(X)))∧
(∀X.BIRD(X)∧¬AB(ASPECT2(X))⊃FLIES(X))∧(∀X.OSTRICH(X)⊃AB(ASPECT2(X)))∧
(∀X.OSTRICH(X)∧¬AB(ASPECT3(X))⊃¬FLIES(X))
deps: (4)
Expanding out the axiom using the definition ⊗a in step 1.
11. (ASSUME |AB2(Z)|)
deps: (11)
Our goal is step 15, but we need to assume its premiss and then
derive its conclusion.
12. (RW 11 (OPEN AB2))
(∃X.BIRD(X)∧Z=ASPECT1(X))∨(∃X.OSTRICH(X)∧Z=ASPECT2(X))
deps: (11)
We use the definition of %2ab2%*.
13. (DERIVE |AB(Z)| (12 10) NIL)
AB(Z)
deps: (4 11)
Here EKL did a lot of work using its DERIVE command. It took about
ten seconds of computer time on this step.
14. (CI (11) 13 NIL)
AB2(Z)⊃AB(Z)
deps: (4)
We discharge the assumption 11.
15. (DERIVE |∀Z.AB2(Z)⊃AB(Z)| (14) NIL)
∀Z.AB2(Z)⊃AB(Z)
deps: (4)
Universal generalization.
16. (DERIVE |∀X.¬AB2(ASPECT1(X))⊃¬FLIES2(X)| (5 6) NIL)
∀X.¬AB2(ASPECT1(X))⊃¬FLIES2(X)
We need to show that %2ab2%* satisfies the axiom defining ⊗ab. This
takes till step 22, because we have to do the parts separately if
we want DERIVE to do the work for us.
17. (DERIVE |∀X.BIRD(X)⊃AB2(ASPECT1(X))| (5 6) NIL)
∀X.BIRD(X)⊃AB2(ASPECT1(X))
18. (DERIVE |∀X.BIRD(X)∧¬AB2(ASPECT2(X))⊃FLIES2(X)| (5 6) NIL)
∀X.BIRD(X)∧¬AB2(ASPECT2(X))⊃FLIES2(X)
19. (DERIVE |∀X.OSTRICH(X)⊃AB2(ASPECT2(X))| (5 6) NIL)
∀X.OSTRICH(X)⊃AB2(ASPECT2(X))
20. (DERIVE |∀X.OSTRICH(X)⊃¬FLIES2(X)| (5 6) NIL)
∀X.OSTRICH(X)⊃¬FLIES2(X)
Our subgoal was step 21 but DERIVE ran out of push down list when
we tried to do it in one step.
21. (DERIVE |∀X.OSTRICH(X)∧¬AB2(ASPECT3(X))⊃¬FLIES2(X)| (20) NIL)
∀X.OSTRICH(X)∧¬AB2(ASPECT3(X))⊃¬FLIES2(X)
22. (DERIVE
|(∀X.¬AB2(ASPECT1(X))⊃¬FLIES2(X))∧(∀X.BIRD(X)⊃AB2(ASPECT1(X)))∧
(∀X.BIRD(X)∧¬AB2(ASPECT2(X))⊃FLIES2(X))∧(∀X.OSTRICH(X)⊃AB2(ASPECT2(X)))∧
(∀X.OSTRICH(X)∧¬AB2(ASPECT3(X))⊃¬FLIES2(X))| (16 17 18 19 20 21)
NIL)
(∀X.¬AB2(ASPECT1(X))⊃¬FLIES2(X))∧(∀X.BIRD(X)⊃AB2(ASPECT1(X)))∧
(∀X.BIRD(X)∧¬AB2(ASPECT2(X))⊃FLIES2(X))∧(∀X.OSTRICH(X)⊃AB2(ASPECT2(X)))∧
(∀X.OSTRICH(X)∧¬AB2(ASPECT3(X))⊃¬FLIES2(X))
Here we form a conjunction.
23. (UE ((AB.|AB2|) (FLIES.|FLIES2|)) 1 NIL)
A(AB2,FLIES2)≡
(∀X.¬AB2(ASPECT1(X))⊃¬FLIES2(X))∧(∀X.BIRD(X)⊃AB2(ASPECT1(X)))∧
(∀X.BIRD(X)∧¬AB2(ASPECT2(X))⊃FLIES2(X))∧(∀X.OSTRICH(X)⊃AB2(ASPECT2(X)))∧
(∀X.OSTRICH(X)∧¬AB2(ASPECT3(X))⊃¬FLIES2(X))
Now we substitute ⊗ab2 and ⊗flies2 in the definition of ⊗A and get
a result we can compare with step 22.
24. (RW 23 (USE 22))
A(AB2,FLIES2)
We have shown that ⊗ab2 and ⊗flies2 satisfy ⊗A.
25. (DERIVE |∀Z.AB(Z)≡AB2(Z)| (9 15 24) NIL)
∀Z.AB(Z)≡AB2(Z)
deps: (4)
9 was the circumscription formula, and 15 and 24 are its two premisses,
so we can now derive its conclusion. Now we know exactly what entities
are abnormal.
26. (RW 8 ((USE 1 MODE: EXACT) ((USE 25 MODE: EXACT) (OPEN AB2))))
(∀X.¬(∃X1.BIRD(X1)∧X=X1)⊃¬FLIES(X))∧
(∀X.BIRD(X)∧¬(∃X2.OSTRICH(X2)∧X=X2)⊃FLIES(X))∧(∀X.OSTRICH(X)⊃¬FLIES(X))
deps: (4)
We rewrite the axiom now that we know what's abnormal. This gives
a somewhat awkward formula that nevertheless contains the desired
conclusion. The occurrences of equality are left over from the
elimination of the aspects that used the axiom of step 2. You should
see what step 26 looked like before we got that axiom properly
formulated.
27. (DERIVE |∀X.FLIES(X)≡BIRD(X)∧¬OSTRICH(X)| (26) NIL)
∀X.FLIES(X)≡BIRD(X)∧¬OSTRICH(X)
deps: (4)
%1
DERIVE straightens out 26 to put the conclusion in the desired form.
The result is still dependent on the assumption of the correctness of
the circumscription made in step 4.
Clearly if circumscription is to become a practical technique,
the reasoning has to become much more automatic.
.end
.skip 1
%3Appendix B%1
Here is an annotated EKL proof that circumscribes the predicate
%2e(x,y)%1 discussed in section 6.
.skip 1
.begin nofill
%7
;the proof EQUAL:
;This proof uses circumscription to maximize the uniqueness of names, through
;the circumscription of an equivalence relation E(X,Y), which is to be
;interpreted as asserting the equivalence of the objects denoted by X and Y.
;labels: SIMPINFO
1. (AXIOM |INDEX(A)=1∧INDEX(B)=2∧INDEX(C)=3∧INDEX(D)=4|)
;Since EKL does not have attachments to determine the equivalence of names,
;we establish a correspondence between the names in our domain and the integers.
;The label SIMPINFO on this statement and the next indicates that these statements
;should be used by the EKL simplifier whenever appropriate;
;labels: SIMPINFO
2. (AXIOM |¬1=2∧¬1=3∧¬2=3∧¬1=4∧¬2=4∧¬2=4|)
;Unfortunately, EKL does not have a sufficiently sophisticated number theory
;to recognize the uniqueness of integers -- this should not be necessary.
3. (DEFINE EQUIV
|∀E.EQUIV(E)≡
(∀X.E(X,X))∧(∀X Y.E(X,Y)⊃E(Y,X))∧(∀X Y Z.E(X,Y)∧E(Y,Z)⊃E(X,Z))| NIL)
;Here we state the conditions necessary for E to be an equivalence relation.
;It must be reflexive, symmetric, and transitive.
4. (DEFINE WORLD |∀E.WORLD(E)≡E(A,B)∧EQUIV(E)| NIL)
;This axiom describes the world we are considering in this example. There
;is an equivalence relation, E, which holds for two names, A and B.
5. (DEFINE WORLD1
|∀E.WORLD1(E)≡
WORLD(E)∧
(∀E1.WORLD(E1)∧(∀X Y.E1(X,Y)⊃E(X,Y))⊃(∀X Y.E(X,Y)≡E1(X,Y)))| NIL)
;This is the circumscription formula -- we are circumscribing the equivalence
;relation E.
6. (ASSUME |WORLD1(E)|)
;deps: (6)
;Since EKL cannot do the circumscription, we assume the result. Many subsequent
;statements will list 6 as a dependency because of this.
7. (DEFINE E2 |∀X Y.E2(X,Y)≡X=A∧Y=B∨X=Y| NIL)
;This is what we would like E1 to be in WORLD1. The only equivalent pairs
;are [A,B] and pairs in which both elements are the same.
8. (DERIVE |EQUIV(E2)| (NIL) ((OPEN EQUIV) (OPEN E2)))
;EQUIV(E2)
;We infer that E2 is an equivalence relation.
9. (DERIVE |WORLD(E2)| (8) ((OPEN WORLD) (OPEN E2)))
;WORLD(E2)
;The relation E2 satisfies the definition of WORLD.
10. (RW 6 (OPEN WORLD1))
;WORLD(E)∧(∀E1.WORLD(E1)∧(∀X Y.E1(X,Y)⊃E(X,Y))⊃(∀X Y.E(X,Y)≡E1(X,Y)))
;deps: (6)
;Here we expand the circumscription formula.
11. (DERIVE |WORLD(E)| (10) NIL)
;WORLD(E)
;deps: (6)
;This is the first conjunct of the previous statement.
12. (RW 11 (OPEN WORLD))
;E(A,B)∧EQUIV(E)
;deps: (6)
;Expanding the definition of WORLD.
13. (RW 12 (OPEN EQUIV))
;E(A,B)∧(∀X.E(X,X))∧(∀X Y.E(X,Y)⊃E(Y,X))∧(∀X Y Z.E(X,Y)∧E(Y,Z)⊃E(X,Z))
;deps: (6)
;Expanding the definition of EQUIV
14. (DERIVE |∀X Y.E2(X,Y)⊃E(X,Y)| (12 13) (OPEN E11))
;∀X Y.E2(X,Y)⊃E(X,Y)
;deps: (6)
15. (DERIVE |∀E1.WORLD(E1)∧(∀X Y.E1(X,Y)⊃E(X,Y))⊃(∀X Y.E(X,Y)≡E1(X,Y))|
(10) NIL)
;∀E1.WORLD(E1)∧(∀X Y.E1(X,Y)⊃E(X,Y))⊃(∀X Y.E(X,Y)≡E1(X,Y))
;deps: (6)
16. (UE ((E1.|E2|)) 15 NIL)
;WORLD(E2)∧(∀X Y.E2(X,Y)⊃E(X,Y))⊃(∀X Y.E(X,Y)≡E2(X,Y))
;deps: (6)
;In the last three steps, we have proven that E2 satisfies the circumscription
;formula.
17. (DERIVE |∀X Y.E(X,Y)≡E2(X,Y)| (9 14 16) NIL)
;∀X Y.E(X,Y)≡E2(X,Y)
;deps: (6)
;Now we can find the set of equivalent pairs.
18. (RW 17 (OPEN E2))
;∀X Y.E(X,Y)≡X=A∧Y=B∨X=Y
;deps: (6)
19. (DERIVE |E(A,B)| (18) NIL)
;E(A,B)
;deps: (6)
;As before, A and B are equivalent.
20. (DERIVE |¬A=C| (SIMPINFO) NIL)
;¬A=C
;From the axioms at the beginning, we derive the uniqueness of names.
21. (DERIVE |¬B=C| (SIMPINFO) NIL)
;¬B=C
22. (DERIVE |¬E(A,C)| (18 20 21) NIL)
;¬E:A,C)
;deps: (6)
;Here we demonstrate that things not explicitly stated to be equivalent
;in our definition of the world are not equivalent.
.end
.skip 1
#. %3References:%1
.single space
%3Etherington, David W. and Raymond Reiter (1983)%1: "On Inheritance
Hierarchies with Exceptions", in %2Proceedings of the National
Conference on Artificial Intelligence, AAAI-83%1, William Kaufman, Inc.
%3Ketonen, Jussi and Joseph S. Weening (1984)%1: %2EKL - An Interactive
Proof Checker, User's Reference Manual%1, Computer Science Department,
Stanford University.
%3Lifschitz, Vladimir%1(1983): unpublished note on circumscription
%3McCarthy, John and P.J. Hayes (1969)%1: "Some Philosophical Problems from
the Standpoint of Artificial Intelligence", in D. Michie (ed), %2Machine
Intelligence 4%1, American Elsevier, New York, NY.
%3McCarthy, John (1977)%1:
"Epistemological Problems of Artificial Intelligence", %2Proceedings
of the Fifth International Joint Conference on Artificial
Intelligence%1, M.I.T., Cambridge, Mass.
%3McCarthy, John (1979)%1:
"First Order Theories of Individual Concepts and Propositions",
in Michie, Donald (ed.) %2Machine Intelligence 9%1, (University of
Edinburgh Press, Edinburgh).
.<<aim 325,concep[e76,jmc]>>
%3McCarthy, John (1980)%1:
"Circumscription - A Form of Non-Monotonic Reasoning", %2Artificial
Intelligence%1, Volume 13, Numbers 1,2, April.
.<<aim 334, circum.new[s79,jmc]>>
%3Minsky, Marvin (1982)%1: in various lectures including AAAI Presidential
Address though not in its written form.
%3Reiter, Raymond (1980)%1: "A Logic for Default Reasoning", %2Artificial
Intelligence%1, Volume 13, Numbers 1,2, April.
%3Reiter, Raymond (1982)%1: "Circumscription Implies Predicate Completion
(Sometimes)", Proceedings of the National Conference on Artificial
Intelligence, AAAI-82.
This version of circum[f83,jmc] pubbed at {time} on {date}.